39 research outputs found

    Towards Adversarial Malware Detection: Lessons Learned from PDF-based Attacks

    Malware still constitutes a major threat in the cybersecurity landscape, partly because of the widespread use of infection vectors such as documents. These infection vectors hide embedded malicious code from the victim users, facilitating the use of social engineering techniques to infect their machines. Research has shown that machine-learning algorithms provide effective detection mechanisms against such threats, but the existence of an arms race in adversarial settings has recently challenged such systems. In this work, we focus on malware embedded in PDF files as a representative case of such an arms race. We start by providing a comprehensive taxonomy of the different approaches used to generate PDF malware, and of the corresponding learning-based detection systems. We then categorize threats specifically targeted against learning-based PDF malware detectors, using a well-established framework in the field of adversarial machine learning. This framework allows us to categorize known vulnerabilities of learning-based PDF malware detectors and to identify novel attacks that may threaten such systems, along with the potential defense mechanisms that can mitigate the impact of such threats. We conclude the paper by discussing how such findings highlight promising research directions towards tackling the more general challenge of designing robust malware detectors in adversarial settings.

    PowerDrive: Accurate De-Obfuscation and Analysis of PowerShell Malware

    PowerShell is nowadays a widely used technology to administer and manage Windows-based operating systems. However, it is also extensively used by malware vectors to execute payloads or drop additional malicious content. As with other scripting languages used by malware, PowerShell attacks are challenging to analyze due to the extensive use of multiple obfuscation layers, which make the real malicious code hard to unveil. To the best of our knowledge, a comprehensive solution for properly de-obfuscating such attacks is currently missing. In this paper, we present PowerDrive, an open-source, static and dynamic multi-stage de-obfuscator for PowerShell attacks. PowerDrive instruments the PowerShell code to progressively de-obfuscate it, showing the analyst the obfuscation steps employed. We used PowerDrive to successfully analyze thousands of PowerShell attacks extracted from various malware vectors and executables. The attained results show interesting patterns used by attackers to devise their malicious scripts. Moreover, we provide a taxonomy of behavioral models adopted by the analyzed code and a comprehensive list of the malicious domains contacted during the analysis.

    Adversarial Detection of Flash Malware: Limitations and Open Issues

    During the past four years, Flash malware has become one of the most insidious threats to detect, with almost 600 critical vulnerabilities targeting Adobe Flash disclosed in the wild. Research has shown that machine learning can be successfully used to detect Flash malware by leveraging static analysis to extract information from the structure of the file or its bytecode. However, the robustness of Flash malware detectors against well-crafted evasion attempts, also known as adversarial examples, has never been investigated. In this paper, we propose a security evaluation of a novel, representative Flash detector that embeds a combination of the prominent static features employed by state-of-the-art tools. In particular, we discuss how to craft adversarial Flash malware examples, showing that it suffices to manipulate the corresponding source malware samples slightly to evade detection. We then empirically demonstrate that popular defense techniques proposed to mitigate evasion attempts, including re-training on adversarial examples, may not always be sufficient to ensure robustness. We argue that this occurs when the feature vectors extracted from adversarial examples become indistinguishable from those of benign data, meaning that the given feature representation is intrinsically vulnerable. In this respect, we are the first to formally define and quantitatively characterize this vulnerability, highlighting when an attack can be countered by solely improving the security of the learning algorithm, and when it also requires considering additional features. We conclude the paper by suggesting alternative research directions to improve the security of learning-based Flash malware detectors.

    Di fronte al rebus del mondo: il postmoderno e la narrativa di Antonio Tabucchi (Facing the Riddle of the World: The Postmodern and the Fiction of Antonio Tabucchi)

    In this thesis, Antonio Tabucchi's fiction is analyzed and put into dialogue with some of the main philosophical and aesthetic-literary categories linked to the concept of the postmodern. Tabucchi's literary work, like that of authentically postmodern writers, engages deeply with the crisis of existence typical of our time (though, of course, also universal). This crisis arises from the fall of the foundations, or "grand narratives", of metaphysical thought and modern humanism, and from the concurrent rise of the paradigms of technological-instrumental rationality; in this sense, the current era has been described as post-modernity or the "age of technology". The first part of this study investigates the philosophical roots and salient themes, such as meaning, identity and history, related to the postmodern condition. The second part presents a literary analysis, mainly thematic, of Tabucchi's work, which seeks to demonstrate how a typically postmodern poetics and ethics operate in it: literature today has the task, and the duty, of questioning and bearing witness to the enigmatic non-sense of the contemporary world, offering us a chance, however precarious and paradoxical, to "inhabit" it, cultivating the utopia of a human time in which to continue to narrate (ourselves) and dream (ourselves).

    On the Feasibility of Adversarial Sample Creation Using the Android System API

    Due to its popularity, the Android operating system is a critical target for malware attacks. Considerable effort has gone into designing malware detection systems that identify potentially harmful applications. In this context, machine learning-based systems, leveraging both static and dynamic analysis, have been increasingly adopted to discriminate between legitimate and malicious samples because they can identify novel variants of malware. At the same time, attackers have been developing several techniques to evade such systems, such as the generation of evasive apps, i.e., carefully perturbed samples that the classifiers label as legitimate. Previous work has shown the vulnerability of detection systems to evasion attacks, including those designed for Android malware detection. However, most works neglected to bring the evasive attacks into the so-called problem space, i.e., to generate concrete Android adversarial samples, which requires preserving the app's semantics and remaining realistic under human expert analysis. In this work, we aim to understand the feasibility of generating adversarial samples specifically through the injection of system API calls, which are typical discriminating characteristics for malware detectors. We perform our analysis on a state-of-the-art ransomware detector that employs the occurrence of system API calls as features of its machine learning algorithm. In particular, we discuss the constraints that are necessary to generate real samples, and we use techniques inherited from interpretability to assess the impact of specific API calls on evasion. We assess the vulnerability of such a detector against mimicry and random noise attacks. Finally, we propose a basic implementation to generate concrete and working adversarial samples. The attained results suggest that injecting system API calls could be a viable strategy for attackers to generate concrete adversarial samples. However, we point out the low suitability of mimicry attacks and the necessity to build more sophisticated evasion attacks.
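
    The two feature-space attacks named in the abstract above can be illustrated with a toy sketch. It assumes a sample is simply a list of per-API occurrence counts and that injection can only add calls, never remove them. The function names and the exact attack definitions are assumptions made for illustration, not the paper's implementation.

```python
import random


def mimicry_injection(sample, benign):
    """Raise each API count up to the benign sample's count.

    Existing counts are never lowered, because injecting calls
    can only add occurrences to the app.
    """
    return [max(s, b) for s, b in zip(sample, benign)]


def random_noise_injection(sample, max_added, rng):
    """Add between 0 and max_added extra occurrences to every feature."""
    return [s + rng.randint(0, max_added) for s in sample]


# Hypothetical count vectors over three system API features.
malicious = [3, 0, 1]
benign = [1, 2, 1]

mimicked = mimicry_injection(malicious, benign)
noisy = random_noise_injection(malicious, 2, random.Random(0))
```

    Both functions only move counts upward, which mirrors the constraint that a working sample must keep its original behavior.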

    PowerDecode: a PowerShell Script Decoder Dedicated to Malware Analysis

    In recent years, PowerShell-based attacks have been widely employed to compromise systems' security. Attackers can easily hide such malicious scripts in file formats (e.g., Office document macros) that can be delivered via large-scale spam mail campaigns. Moreover, attackers employ obfuscation techniques that allow the PowerShell code to evade the most common anti-malware protections and perform unauthorized actions that target the confidentiality, integrity and availability of an information system. In this paper, we present PowerDecode, an open-source module for the de-obfuscation and analysis of PowerShell scripts. In particular, this module receives a script as input and returns its obfuscated layers, its original de-obfuscated variant and a report about possible malicious activities. We tested PowerDecode on almost 3000 malicious scripts, and the attained results showed significantly improved de-obfuscation performance in comparison to state-of-the-art systems. More specifically, PowerDecode was able to resolve multiple types of obfuscation and collect important information about attacks, such as malicious URLs and IP addresses contacted by malware. Finally, PowerDecode can be easily integrated into other malware analysis systems, and can be a valuable aid in identifying malicious activities.
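
    The idea of returning every obfuscation layer of a script can be sketched for one simple case: nested Base64 encoding of UTF-16LE text, the encoding that PowerShell's -EncodedCommand option expects. This is a minimal illustration, not PowerDecode's algorithm; the names peel_layers, try_decode_layer and encode_layer are hypothetical, and real scripts use many other obfuscation types.

```python
import base64


def encode_layer(text):
    """Wrap text in one Base64 layer over its UTF-16LE bytes."""
    return base64.b64encode(text.encode("utf-16-le")).decode("ascii")


def try_decode_layer(text):
    """Return the decoded text, or None if text is not a Base64 layer."""
    candidate = text.strip()
    if not candidate:
        return None
    try:
        raw = base64.b64decode(candidate, validate=True)
        decoded = raw.decode("utf-16-le")
    except ValueError:
        # Covers invalid Base64, non-ASCII input and invalid UTF-16LE bytes.
        return None
    # Reject results containing characters that are neither printable nor whitespace.
    if not all(ch.isprintable() or ch.isspace() for ch in decoded):
        return None
    return decoded


def peel_layers(script, max_layers=10):
    """Return all layers, from the obfuscated input to the innermost text."""
    layers = [script]
    while len(layers) <= max_layers:
        decoded = try_decode_layer(layers[-1])
        if decoded is None:
            break
        layers.append(decoded)
    return layers


layers = peel_layers(encode_layer(encode_layer("Write-Host hi")))
```

    Here layers holds three strings: the doubly encoded input, the singly encoded middle layer, and the plain command. The max_layers cap is a safety bound chosen for the sketch.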

    On the Effectiveness of System API-Related Information for Android Ransomware Detection

    Ransomware constitutes a significant threat to the Android operating system. It can either lock or encrypt the target devices, and victims are forced to pay ransoms to restore their data. Hence, the prompt detection of such attacks takes priority over that of other malicious threats. Previous works on Android malware detection mainly focused on machine learning-oriented approaches that were tailored to identifying malware families, without a clear focus on ransomware. More specifically, such approaches resorted to complex information types such as permissions, user-implemented API calls, and native calls. However, this led to significant drawbacks concerning complexity, resilience against obfuscation, and explainability. To overcome these issues, in this paper, we propose and discuss learning-based detection strategies that rely on System API information. These techniques leverage the fact that ransomware attacks rely heavily on System API to perform their actions, and they allow distinguishing between generic malware, ransomware and goodware. We tested three different ways of employing System API information, i.e., through packages, classes, and methods, and we compared their performance to other, more complex state-of-the-art approaches. The attained results showed that systems based on System API could detect ransomware and generic malware with very good accuracy, comparable to systems that employed more complex information. Moreover, the proposed systems could accurately detect novel samples in the wild and showed resilience against static obfuscation attempts. Finally, to guarantee early on-device detection, we developed and released on the Android platform a complete ransomware and malware detector (R-PackDroid) that employed one of the methodologies proposed in this paper.
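
    The three granularities mentioned in the abstract above (packages, classes, methods) can be pictured as different ways of grouping the same list of calls. The sketch below assumes each call is a fully qualified method name of the form package.Class.method with no nested classes. The helper names and the example input are illustrative and are not taken from R-PackDroid.

```python
from collections import Counter


def api_key(call, granularity):
    """Reduce a fully qualified method name to the chosen granularity."""
    parts = call.split(".")
    if granularity == "method":
        return call
    if granularity == "class":
        return ".".join(parts[:-1])  # drop the method name
    if granularity == "package":
        return ".".join(parts[:-2])  # drop the class and method names
    raise ValueError("granularity must be 'package', 'class' or 'method'")


def api_features(calls, granularity="package"):
    """Count occurrences of each package, class or method among the calls."""
    return Counter(api_key(call, granularity) for call in calls)


# Illustrative calls, as they might be extracted from an app's bytecode.
calls = [
    "javax.crypto.Cipher.init",
    "javax.crypto.Cipher.doFinal",
    "android.app.admin.DevicePolicyManager.lockNow",
]
by_package = api_features(calls, "package")
by_class = api_features(calls, "class")
```

    Coarser granularities yield fewer distinct features: the two Cipher calls collapse into a single javax.crypto count at the package level.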